Abstract

A delay-constrained scheduling problem for point-to-point communication is considered: a packet of B bits must be transmitted by a hard deadline of T slots over a time-varying channel. The transmitter/scheduler must determine how many bits to transmit, or equivalently how much energy to transmit with, during each time slot based on the current channel quality and the number of unserved bits, with the objective of minimizing expected total energy. In order to focus on the fundamental scheduling problem, it is assumed that no other packets are scheduled during this time period and no outage is allowed. Assuming transmission at capacity of the underlying Gaussian noise channel, a closed-form expression for the optimal scheduling policy is obtained for the case T = 2 via dynamic programming; for T > 2, the optimal policy can only be numerically determined. Thus, the focus of the work is on derivation of simple, near-optimal policies based on intuition from the T = 2 solution and the structure of the general problem. The proposed bit-allocation policies consist of a linear combination of a delay-associated term and an opportunistic (channel-aware) term. In addition, a variation of the problem in which the entire packet must be transmitted in a single slot is studied, and a channel-threshold policy is shown to be optimal.



I. Introduction 

A time-varying channel is a fundamental feature of wireless communication. In this context, opportunistic
scheduling refers to the idea of transmitting with more power/higher rate when the channel quality is good and
less power/lower rate when the channel is in a poor state. While this strategy is efficient from the perspective
of long-term average rate, it is not necessarily appropriate for delay-constrained traffic which requires guaranteed
short-term performance.

In this paper we consider the problem of transmitting a packet of B bits over T time slots, where the channel
fades independently from slot to slot and the transmitter has perfect causal channel information (i.e., knowledge
of the current channel, but not of the future channel). During each slot, the transmitter (or scheduler hereafter)
determines how many bits to transmit based on the current channel quality and the number of bits yet to be served.
The scheduler must balance the desire to be opportunistic, i.e., wait to serve many of the bits when the channel is in
a good state, with the hard deadline. We investigate the setting where there is a single packet to be transmitted (i.e.,
no other packets are scheduled during the T slot delay horizon), the packet must be transmitted by the deadline, 
and transmission occurs at capacity of the underlying Gaussian noise channel. In this framework our objective is 
to design a scheduling policy that minimizes the expected energy consumed. This setup reasonably models
delay-constrained applications such as VoIP, where packets arrive regularly and each must be received within a short
delay window. In such a setting perhaps the most important design objective is to minimize the resources (in our 
case, energy) needed to meet the delay requirements. In the cellular uplink, for example, an energy-minimizing 
policy would extend the battery life of mobile terminals. 

A. Prior Work 

Delay constrained scheduling in wireless communication systems has been actively studied in various network
settings under different traffic models and delay constraints (see for example [1]-[8] and references
therein). In [1]-[3], power/rate control policies that minimize average delay are studied for a fading channel
with random packet arrivals. In [4]-[8], systems with random packet arrivals, hard delay constraints, and
general energy-rate relationships are studied, but the emphasis is on "offline" algorithms in which the scheduler



has non-causal knowledge of the packet arrivals and the channel states; heuristic variations of the optimal "offline" 
algorithms are also proposed for the more challenging "online" (i.e., causal) setting. 

In this paper, we rather focus on the interplay between fading, hard deadlines, and causal channel information 
by studying transmission of only a single packet, and thus do not consider random arrivals. Not only is this model 
more tractable, but it also more reasonably models applications with deterministic packet arrivals, e.g., VoIP or 
video streaming. To emphasize our treatment of physical-layer issues, we use the terms causal and non-causal 
rather than online and offline to indicate whether the scheduler has knowledge of future channel states. Recently, 
Fu et al. [9] considered this problem (single packet transmission over a block fading channel, subject to a hard
deadline) and formulated it as a finite-horizon dynamic program (DP). For general energy-bit functions this DP
can only be solved numerically, but in [9] a closed-form description of the optimal policy is derived for the special
case where the energy-bit relationship is linear and the channel state is restricted to be an integer multiple of some
constant. In this work we specialize the framework of [9] to the case where the energy-bit relationship is governed
by the AWGN channel capacity formula, and derive closed-form descriptions of the optimal policy for T = 2 and
sub-optimal policies for T > 2. In [10] the work of [9] is extended to a setting where the channel evolves according
to a continuous Markov process, and the optimal scheduler is derived for the case where the energy-bit relationship 
is given by the AWGN capacity formula under particular assumptions on the channel model (channels with drift). 
However, these results do not apply to the block fading model considered here and the policies are rather different 
in structure from those developed here. 

In an earlier work, Negi and Cioffi [11] studied the dual problem of maximizing the expected number of
transmitted bits in a finite number of slots subject to a finite energy constraint (with the energy-bit relationship 
described by the AWGN capacity formula). The optimal policy can generally only be found by numerical methods 
(although a threshold policy is found to be optimal at low SNR), and thus the solutions give little insight into 
how the scheduling parameters (e.g., channel state, number of bits to serve, number of slots remaining toward 
the deadline, and the like) affect the scheduling process. Although we deal primarily with suboptimal scheduling 
policies, we are able to deduce the effect of these parameters on the optimal policy. 

B. Summary of Contribution 

In this paper, we develop low-complexity, near-optimal scheduling policies for delay-constrained causal
scheduling. Our main result is the following scheduler, a time-dependent weighted sum of a delay-associated term
and an opportunistic term:

$$b_t = \underbrace{\frac{1}{t}\beta_t}_{\text{delay associated}} + \underbrace{\frac{t-1}{t}\log_2\frac{g_t}{\eta_t}}_{\text{opportunistic}}, \qquad (1)$$

where $b_t$ is the number of bits to serve (from the remaining $\beta_t$ bits) at time slot $t$ ($t$ is in descending order and
thus represents the number of remaining slots), $g_t$ denotes the current channel state, and $\eta_t$ denotes a channel
threshold determined by the channel statistics and the particular policy. If the current channel quality is equal to
the threshold level, then a fraction $1/t$ of the remaining bits is transmitted. If the channel quality is better/worse
than the threshold, then additional/fewer bits are transmitted. The scheduler acts very opportunistically when the
deadline is far away ($t$ large) but less so as the deadline approaches. This form is motivated by
the simple T = 2 case, for which it is shown to be optimal.

Two different suboptimal policies in the form of (1) are proposed, one through a simple extension of the optimal
T = 2 scheduler and the other by solving a relaxed version of the optimization. Numerical results are presented 
to illustrate that these policies provide a significant advantage over a naive equal-bit policy, and that they perform 
quite close to the optimal for moderate/large values of B. In addition, we consider the case of one-shot allocation 
where the entire packet must be transmitted in only one of the slots. This is an optimal stopping problem, from 
which it follows that a simple channel threshold policy is optimal. 

This paper is organized as follows. Section II describes the problem formulation. Section III discusses the optimal
scheduler, and Section IV develops suboptimal schedulers together with a general framework that reveals how the
delay constraint is incorporated into the scheduling process. Section V
provides analysis and simulations. Section VI considers the one-shot allocation problem. We conclude in Section
VII.
Fig. 1: Single-user delay constrained scheduling



Notations: The operation $\mathbb{E}[X]$ for a random variable $X$ denotes the expected value. The operation $\mathbb{G}[X]$ for a
random variable $X$ denotes $e^{\mathbb{E}[\ln X]}$, and the function $G(x_1, \ldots, x_m)$ for deterministic quantities $x_1, \ldots, x_m$ denotes
the geometric mean $(\prod_{i=1}^{m} x_i)^{1/m}$. The operation $\langle \cdot \rangle_x^y$ denotes truncation from below at $x$ and truncation from above
at $y$. The function $1\{\cdot\}$ denotes the indicator function, i.e., its value is 1 if the argument is true and 0 otherwise.
The sets $\mathbb{R}_+$ and $\mathbb{R}_{++}$ denote the set of non-negative numbers and the set of positive numbers, respectively.



II. Problem Formulation 

We consider a single-user delay constrained scheduling problem as illustrated in Fig. 1: a packet of B bits
must be transmitted within T time slots through a fading channel, in which T is referred to as the delay-limit or 
deadline. We assume no other packet is scheduled during the T time slots, and that the packet must be transmitted 
by the deadline (i.e., no outage is allowed). Although these two assumptions may not be entirely realistic, even for 
relatively deterministic traffic (e.g., in VoIP, the next packet generally arrives before the deadline of the previous has 
expired; furthermore, a small percentage of packets are allowed to miss their deadlines), this set of assumptions
makes the problem relatively tractable and allows us to focus on the central issue of meeting deadlines based upon
causal channel information. The purpose of the scheduler is to determine the energy, or equivalently the number 
of bits, to be served during each time slot such that the expected energy is minimized and the bits are served by 
the deadline T. 

Time is indexed in descending order, i.e., $t = T$ is the initial slot, $t = T - 1$ is the 2nd slot, . . ., and $t = 1$ is the
final slot before the deadline; in doing so, $t$ represents the number of remaining slots. The channel state, in power
units, is denoted by $g_t$. We assume that the channel states $\{g_t\}_{t=1}^{T}$ are independently and identically distributed
(i.i.d.) and the scheduler has causal knowledge of these channel states (i.e., at time $t$, $g_T, g_{T-1}, \ldots, g_t$ are known
but $g_{t-1}, \ldots, g_1$ are unknown). In this context, we refer to this type of scheduler as a causal scheduler. The channel
state $g$ is assumed to be a non-degenerate positive continuous random variable.

Assuming unit variance Gaussian additive noise and transmission at capacity, the number of transmitted bits,
denoted as $b_t$, if $E_t$ energy is used is given by $b_t = \log_2(1 + g_t E_t)$. By solving for $E_t$ we arrive at a formula for
the energy cost in terms of the channel state $g_t$ and the number of bits$^1$ served $b_t$:

$$E_t(b_t, g_t) = \frac{2^{b_t} - 1}{g_t}. \qquad (2)$$
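
As a small illustration of the energy-bit relationship in (2), the sketch below evaluates the cost and its inverse in Python. The function names `energy_cost` and `bits_served` are hypothetical and chosen only for this example.

```python
import math

def energy_cost(bits: float, gain: float) -> float:
    """Energy per symbol needed to send `bits` bits at capacity over a channel with power gain `gain`."""
    return (2.0 ** bits - 1.0) / gain

def bits_served(energy: float, gain: float) -> float:
    """Inverse map: bits per symbol delivered at capacity when `energy` is spent."""
    return math.log2(1.0 + gain * energy)

# Round trip: sending 3 bits over a channel with gain 0.5.
e = energy_cost(3.0, 0.5)
b = bits_served(e, 0.5)

# Convexity in the number of bits: splitting 2 bits evenly over two slots
# with the same gain costs less than sending both bits in one slot.
split_cost = 2 * energy_cost(1.0, 1.0)
single_cost = energy_cost(2.0, 1.0)
```

The convexity of the cost in the number of bits is what makes spreading bits over several slots attractive, and it is the reason the scheduling problem is non-trivial.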

We use $\beta_t$ to denote the queue state, i.e., the remaining bits at time slot $t$. Then, $\beta_t$ can be calculated recursively
as $\beta_t = \beta_{t+1} - b_{t+1}$. Given this setup, a scheduler is a sequence of functions $\{b_t\}_{t=1}^{T}$ that maps from the remaining
bits and the current channel state$^2$ to the number of bits served, i.e., $b_t : \mathbb{R}_+ \times \mathbb{R}_{++} \to [0, \beta_t]$. Then, the optimal
energy-efficient scheduler is the set of scheduling functions $\{b_t^{\mathrm{opt}}(\cdot, \cdot)\}_{t=1}^{T}$ that minimizes the total expected energy
cost (summed over the T slots), i.e.,

$$\min_{b_T, \ldots, b_1} \mathbb{E}\left[\sum_{t=1}^{T} E_t(b_t, g_t)\right] \qquad (3)$$



$^1$ An implicit assumption is that each slot spans n channel symbols, for n reasonably large, and that powerful coding allows for transmission
of $n b_t$ bits in the t-th slot. Thus, the quantity $b_t$ should be thought of as the number of bits transmitted per channel symbol during the t-th
scheduling slot.

$^2$ Because the channel states are assumed to be i.i.d., it is sufficient to make scheduling decisions based only on the current channel
(while ignoring past channels). If channels are correlated across time slots, then the past and present channel should be used to compute the
conditional distribution of future channel states and all expected future energy costs should be computed with respect to these conditional
distributions.



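subject to $\sum_{t=1}^{T} b_t = B$ and $b_t \ge 0$ for all $t$.

The optimization in (3) can be formulated sequentially (via dynamic programming) with the remaining bits $\beta_t$
as a state variable that summarizes the bit allocation up until the previous time step:

$$J_t^{\mathrm{opt}}(\beta_t, g_t) = \begin{cases} \min_{0 \le b_t \le \beta_t} \left( E_t(b_t, g_t) + \mathbb{E}\left[ J_{t-1}^{\mathrm{opt}}(\beta_t - b_t, g_{t-1}) \right] \right), & t \ge 2, \\ E_1(\beta_1, g_1), & t = 1. \end{cases} \qquad (4)$$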



This is the standard backward iteration: we first determine the optimal action at $t = 1$, then find the optimal policy
at $t = 2$ by taking into account the optimal policy to be used at $t = 1$, and so forth. Since $g_t$ is known but future
channel states $g_{t-1}, \ldots, g_1$ are unknown, the quantity $E_t$ is not random but the future energy costs $E_{t-1}, \ldots, E_1$
are random. Note also that the optimization (4) should be performed for all possible values of $\beta_t$ and $g_t$. In other
words, deriving the optimal scheduling function $b_t^{\mathrm{opt}}$ is equivalent to finding the optimal decision rule for all possible
pairs $(\beta_t, g_t)$.

III. Optimal Scheduling 

In this section we attempt to derive the optimal (causal) scheduler using the conventional dynamic programming 
technique lITll . Unfortunately, an analytic expression is obtained only when T = 2 (besides the T = 1 trivial 
case). For T > 2, we discuss the difficulty in obtaining an analytic expression. When the scheduler has non-causal 
knowledge of the future channel states, however, deriving an optimal scheduler is possible; the optimal non-causal 
scheduler provides useful intuition and is derived in Appendix [A] 



A. Optimal Scheduler for T = 2 

In the final time slot ($t = 1$), the scheduler is required to transmit all $\beta_1$ unserved bits regardless of the channel
state $g_1$, due to the hard delay constraint. Thus, the energy cost is given by $E_1(\beta_1, g_1) = (2^{\beta_1} - 1)/g_1$ for all $g_1$,
and the expected cost to serve $\beta_1$ bits in the final slot is $\mathbb{E}_{g_1}[E_1(\beta_1, g_1)] = \mathbb{E}[1/g]\,(2^{\beta_1} - 1)$.

At $t = 2$, $g_2$ is known but $g_1$ is unknown. The scheduler needs to determine $b_2$, based on $g_2$ and $B$, while
balancing the current energy cost (of serving $b_2$ bits in the current slot) and the expected future cost (of deferring
$B - b_2$ bits to the last slot). Thus, the optimum scheduler is the solution to the following minimization:

$$b_2^{\mathrm{opt}}(B, g_2) = \arg\min_{0 \le b_2 \le B} \left( \frac{2^{b_2} - 1}{g_2} + \mathbb{E}\left[\frac{1}{g}\right]\left(2^{B - b_2} - 1\right) \right). \qquad (5)$$

The objective function in (5) is convex, and therefore the minimizer is found by setting the derivative to zero while
taking into account the constraints on $b_2$:

$$b_2^{\mathrm{opt}}(B, g_2) = \left\langle \frac{1}{2}B + \frac{1}{2}\log_2(g_2 \nu_1) \right\rangle_0^B, \qquad (6)$$

where $\nu_1 = \mathbb{E}[1/g]$ is a constant that depends only on the distribution of the channel state $g$ (see Appendix B for
the definition of constants $\nu_m$ for $m = 1, 2, \ldots$). Note that this policy depends only on the unserved bits and the
current channel state. This policy is only meaningful when $\nu_1$ is finite; this rules out Rayleigh fading, in which
case $g$ is exponentially distributed and thus $\mathbb{E}[1/g]$ is not finite.

Notice that the optimal scheduling function (6) has two additive terms: (a) $\frac{1}{2}B$ corresponds to an equal distribution
to time slots $t = 1$ and $t = 2$, and (b) $\frac{1}{2}\log_2(g_2 \nu_1)$ is associated with a measure of the channel quality at $t = 2$. That
is, if the channel quality $g_2$ is bigger than a threshold $1/\nu_1$, then more bits are allocated than $\frac{1}{2}B$; if $g_2$ is smaller
than the threshold then fewer bits are allocated and more bits are deferred to the final slot.
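
The two-term structure of the T = 2 policy can be sketched in a few lines of Python. The function names and the numerical values of B, $g_2$ and $\nu_1$ below are hypothetical and serve only as an illustration; the brute-force grid check simply confirms that the clipped closed form minimizes the current-plus-expected-future cost.

```python
import math

def optimal_bits_T2(B: float, g2: float, nu1: float) -> float:
    """T = 2 allocation: B/2 + 0.5*log2(g2*nu1), truncated to [0, B]."""
    raw = 0.5 * B + 0.5 * math.log2(g2 * nu1)
    return min(max(raw, 0.0), B)

def expected_cost_T2(b2: float, B: float, g2: float, nu1: float) -> float:
    """Energy spent now on b2 bits plus expected energy for the B - b2 deferred bits."""
    return (2.0 ** b2 - 1.0) / g2 + nu1 * (2.0 ** (B - b2) - 1.0)

B, nu1 = 4.0, 2.0                             # illustrative values, threshold is 1/nu1 = 0.5
at_threshold = optimal_bits_T2(B, 0.5, nu1)   # channel equals threshold: half the bits
good_channel = optimal_bits_T2(B, 2.0, nu1)   # better channel: more than half
bad_channel = optimal_bits_T2(B, 1 / 128, nu1)  # very poor channel: defer everything

# Brute-force check that the closed form minimizes the cost for g2 = 2.0.
grid = [i * B / 400 for i in range(401)]
best_on_grid = min(expected_cost_T2(b, B, 2.0, nu1) for b in grid)
closed_form_cost = expected_cost_T2(good_channel, B, 2.0, nu1)
```

With these values the scheduler serves exactly B/2 bits when the channel sits at the threshold, and shifts one bit toward the current slot for every factor of four by which $g_2 \nu_1$ exceeds one.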



B. Optimal Scheduler for T > 2 

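From (4), the optimization that the scheduler solves at each time step is:

$$\min_{0 \le b_t \le \beta_t} \left( \frac{2^{b_t} - 1}{g_t} + \bar{J}_{t-1}^{\mathrm{opt}}(\beta_t - b_t) \right), \qquad (7)$$

where $\bar{J}_{t-1}^{\mathrm{opt}}(\beta) = \mathbb{E}_g[J_{t-1}^{\mathrm{opt}}(\beta, g)]$ denotes the cost-to-go function, which is the expected cost to serve $\beta$ bits in $(t - 1)$
slots if the optimal control policy is used at each step. This is a one-dimensional convex optimization (pp. 87-88
in [13]) over $b_t$ and the optimal solution satisfies

$$b_t^{\mathrm{opt}}(\beta_t, g_t) = \begin{cases} 0, & g_t \le \dfrac{\ln 2}{(\bar{J}_{t-1}^{\mathrm{opt}})'(\beta_t)}, \\ \arg_b \left\{ \dfrac{2^{b} \ln 2}{g_t} = (\bar{J}_{t-1}^{\mathrm{opt}})'(\beta_t - b) \right\}, & \dfrac{\ln 2}{(\bar{J}_{t-1}^{\mathrm{opt}})'(\beta_t)} < g_t < \dfrac{2^{\beta_t} \ln 2}{(\bar{J}_{t-1}^{\mathrm{opt}})'(0)}, \\ \beta_t, & g_t \ge \dfrac{2^{\beta_t} \ln 2}{(\bar{J}_{t-1}^{\mathrm{opt}})'(0)}, \end{cases} \qquad (8)$$

assuming $\bar{J}_{t-1}^{\mathrm{opt}}$ is differentiable (pp. 254-255 in [14]), where $\arg_b\{\cdot\}$ represents the solution$^3$ of the argument
equation.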

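When $t = 2$, the cost-to-go function $\bar{J}_1^{\mathrm{opt}}(\beta) = (2^{\beta} - 1)\nu_1$ (as well as its derivative) takes on a very simple form
and thus (8) can be solved in closed form as in (6). However, the same is not true for $t > 2$. Because the optimal
policy for $t = 2$ is known, the cost-to-go $\bar{J}_2^{\mathrm{opt}}(\beta)$ can be written in closed form. The derivative $(\bar{J}_2^{\mathrm{opt}})'(\beta)$ can also
be written in closed form but cannot be analytically inverted; thus, the optimal policy for $t = 3$ can only be written
in the form of (8) with the second condition given by the following fixed point equation:

$$\frac{1}{g_3} = 2^{\beta_3 - 2b_3} \int_0^{\frac{2^{-(\beta_3 - b_3)}}{\nu_1}} \nu_1 \, dF(x) + 2^{\frac{\beta_3 - 3b_3}{2}} \nu_1^{1/2} \int_{\frac{2^{-(\beta_3 - b_3)}}{\nu_1}}^{\frac{2^{\beta_3 - b_3}}{\nu_1}} \left(\frac{1}{x}\right)^{1/2} dF(x) + 2^{\beta_3 - 2b_3} \int_{\frac{2^{\beta_3 - b_3}}{\nu_1}}^{\infty} \frac{1}{x} \, dF(x), \qquad (9)$$

where $F$ is the cumulative distribution function of the channel state $g$. As a result, no analytical characterization
of $\bar{J}_3^{\mathrm{opt}}(\beta)$ is possible, and thus neither $b_t^{\mathrm{opt}}(\cdot, \cdot)$ nor $\bar{J}_t^{\mathrm{opt}}(\beta)$ can be found in closed form for $t \ge 4$.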

Alternatively, we can numerically find the optimal scheduler by the discretization method [15]. However, considerable
computation and memory are required for a sufficiently fine discretization. More importantly, this numerical method
gives little insight on how the delay constraint and channel state affect the scheduling function.
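
To make the numerical approach concrete, the following Python sketch runs the backward iteration on a grid of bit values. It is only an illustration under simplifying assumptions that are not part of the paper: the channel takes finitely many values (`gains` with probabilities `probs`) instead of being continuous, and allocations are restricted to multiples of `B / steps`. The function name `cost_to_go_tables` is hypothetical.

```python
def cost_to_go_tables(T, B, gains, probs, steps=30):
    """Return J where J[t][k] approximates the expected energy needed to serve
    k * (B / steps) bits when t slots remain (discretized backward iteration)."""
    delta = B / steps
    nu1 = sum(p / g for g, p in zip(gains, probs))  # E[1/g]
    # Final slot: all remaining bits must be sent whatever the channel is.
    J = {1: [nu1 * (2.0 ** (k * delta) - 1.0) for k in range(steps + 1)]}
    for t in range(2, T + 1):
        prev = J[t - 1]
        row = []
        for k in range(steps + 1):
            expected = 0.0
            for g, p in zip(gains, probs):
                # For this channel value, try every feasible number of bits j * delta.
                best = min((2.0 ** (j * delta) - 1.0) / g + prev[k - j]
                           for j in range(k + 1))
                expected += p * best
            row.append(expected)
        J[t] = row
    return J

# Hypothetical three-state channel.
gains, probs = [0.1, 1.0, 4.0], [0.2, 0.5, 0.3]
tables = cost_to_go_tables(T=3, B=6.0, gains=gains, probs=probs, steps=30)
optimal_cost = tables[3][-1]
equal_bit_cost = 3 * sum(p / g for g, p in zip(gains, probs)) * (2.0 ** 2.0 - 1.0)
```

The work grows as T times the square of the grid size times the number of channel states, which illustrates why a fine discretization quickly becomes expensive.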

IV. Suboptimal Scheduling Policies

Because the optimal scheduler cannot be written in closed form, it is of interest to develop suboptimal schedulers. 
The first scheduler is based on the intuition from the optimal T = 2 policy, and the second is found by solving a 
relaxed version of the optimization. 

A. Suboptimal I Scheduler 

If we compare the optimal causal scheduler for T = 2 (Section III-A) to the non-causal scheduler, we can
immediately notice that the optimal scheduler determines $b_2^{\mathrm{opt}}$ by inverse-waterfilling over channels $g_2$ and $1/\nu_1$,
whereas the non-causal scheduler inverse-waterfills over $g_2$ and the actual value of $g_1$.$^4$ This is because of the
particularly simple form of the expected future cost. Although the expected future cost does not take on such a
simple form for T > 2, we can get a suboptimal scheduler by simply applying this inverse-waterfilling at every
time slot $t$. In other words, at time step $t$, perform inverse-waterfilling over the following $t$ channels:

$$g_t, \ \underbrace{\frac{1}{\nu_1}, \ldots, \frac{1}{\nu_1}}_{t-1}$$

to determine how many of the unserved $\beta_t$ bits to serve now. We denote this bit allocation policy as $b_t^{(I)}$. Since
$t - 1$ of the $t$ channels are equal, the inverse-waterfilling operation is very simple and the policy is given by

$$b_t^{(I)}(\beta_t, g_t) = \left\langle \frac{1}{t}\beta_t + \frac{t-1}{t}\log_2\frac{g_t}{\eta_t^{(I)}} \right\rangle_0^{\beta_t}, \qquad (10)$$

where $\eta_t^{(I)} = 1/\nu_1$ serves as the channel threshold. Notice that this threshold value depends only on the channel
statistics and is constant with respect to $t$.

$^3$ Because of the convexity, the solution is unique if it exists.

$^4$ When both $g_2$ and $g_1$ are known at $t = 2$, the optimal non-causal scheduling policy is given by $b_2^{\mathrm{IWF}}(B, g_2) = \left\langle \frac{1}{2}B + \frac{1}{2}\log_2\frac{g_2}{g_1} \right\rangle_0^B$,
in which "IWF" stands for inverse waterfilling (see Appendix A for details).

When the deadline is far away (large t), the first term in (10) is negligible and the bit allocation is almost
completely dependent on the instantaneous channel quality. As the deadline approaches (t decreases toward 1), the 
weight of the channel-dependent second term decreases and the weight of the delay-associated first term increases. 

B. Suboptimal II Scheduler 

The inability to find a general analytic solution to the original optimization (4) is due to complications caused
by the constraint $0 \le b_t \le \beta_t$ (for each $t$) in the dynamic optimization. However, if we relax this constraint (i.e.,
allow $b_t < 0$ and $b_t > \beta_t$ while maintaining the constraint $\sum_{t=1}^{T} b_t = B$) we can derive the optimal policy in closed
form.

If we define the function $L_t$ as below, then we can show inductively that $L_t$ represents the cost-to-go function
for the relaxed optimization:

$$L_t(\beta_t) = t \, 2^{\beta_t / t} \, G(\nu_t, \nu_{t-1}, \ldots, \nu_1) - t \nu_1, \qquad (11)$$

where $\nu_1, \nu_2, \ldots$ are the fractional moments defined in Appendix B and $G(\cdot)$ represents the geometric mean operation
defined in Section I. When $t = 1$, (11) holds trivially. If we assume (11) holds for $t - 1$, then the relaxed optimization
for the next time step is given by

$$\min_{b_t} \left( \frac{2^{b_t} - 1}{g_t} + L_{t-1}(\beta_t - b_t) \right) \qquad (12)$$

and the solution (i.e., the optimum scheduler for the relaxed problem) is found by setting the derivative of the
objective to zero:

$$b_t = \frac{1}{t}\beta_t + \frac{t-1}{t}\log_2\left(g_t \, G(\nu_{t-1}, \ldots, \nu_1)\right). \qquad (13)$$

By plugging the optimum value of $b_t$ in (13) into (12) and taking the expectation with respect to $g_t$, we reach
(11). By truncating the policy in (13) at 0 and $\beta_t$ we get a policy, referred to as Suboptimal II, for the original
(un-relaxed) problem:

$$b_t^{(II)}(\beta_t, g_t) = \left\langle \frac{1}{t}\beta_t + \frac{t-1}{t}\log_2\frac{g_t}{\eta_t^{(II)}} \right\rangle_0^{\beta_t}, \qquad (14)$$

where

$$\eta_t^{(II)} = \frac{1}{G(\nu_{t-1}, \ldots, \nu_1)} \qquad (15)$$

denotes the threshold, which depends only on the channel statistics and not on the realizations.
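
The two thresholds can be computed directly once the fractional moments are available. The Python sketch below assumes the definition $\nu_m = (\mathbb{E}[(1/g)^{1/m}])^m$ (Appendix B is not reproduced here, so this definition is an assumption, although it agrees with $\nu_1 = \mathbb{E}[1/g]$) and, purely for illustration, a channel with finitely many values. All function names are hypothetical.

```python
import math

def frac_moment(m, gains, probs):
    """nu_m = (E[(1/g)^(1/m)])^m for a discrete channel distribution (assumed definition)."""
    return sum(p * g ** (-1.0 / m) for g, p in zip(gains, probs)) ** m

def threshold_I(gains, probs):
    """Suboptimal I threshold: 1/nu_1, the same for every t."""
    return 1.0 / frac_moment(1, gains, probs)

def threshold_II(t, gains, probs):
    """Suboptimal II threshold for t >= 2: reciprocal of the geometric mean of nu_1..nu_{t-1}."""
    logs = [math.log(frac_moment(m, gains, probs)) for m in range(1, t)]
    return math.exp(-sum(logs) / len(logs))

# Hypothetical three-state channel.
gains, probs = [0.1, 1.0, 4.0], [0.2, 0.5, 0.3]
eta_I = threshold_I(gains, probs)
eta_II = [threshold_II(t, gains, probs) for t in range(2, 9)]
```

For t = 2 the two thresholds coincide, and the Suboptimal II threshold grows with t because $\nu_m$ shrinks as m increases.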

C. Remarks on the Suboptimal Schedulers 

From (10) and (14), we can see that the two schedulers have a very similar form, the only difference
being the threshold $\eta_t$. Notice that both policies simplify to the optimal policy for $t = 2$. Based on the policy
formulations, this subsection investigates the common and different characteristics of the suboptimal schedulers.

Fig. 2: Thresholds $\eta_t^{(I)}$ for the Suboptimal I scheduler and $\eta_t^{(II)}$ for the Suboptimal II scheduler when the channel
state follows the truncated exponential distribution with $\gamma_0 = 0.001$.



1) General Framework: The two algorithms thus far considered can be cast into a single framework:

$$b_t(\beta_t, g_t) = \left\langle \frac{1}{t}\beta_t + \frac{t-1}{t}\log_2\frac{g_t}{\eta_t} \right\rangle_0^{\beta_t}, \qquad (16)$$

where $\eta_t$ is the channel threshold determined by the individual algorithms. This simple allocation strategy reveals
how the delay constraint affects the scheduling algorithms: at time step $t$ serve a fraction $1/t$ of the remaining
bits plus/minus a quantity that depends on the strength of the current channel compared to a channel threshold. If
the current channel is good (i.e., $g_t$ is bigger than the threshold $\eta_t$), additional bits are served (up to $\beta_t$), while
fewer bits are served when the current channel is poorer than the threshold. Furthermore, note that when $t$ is large
(i.e., far from the deadline), the first term $\beta_t / t$ is very small and the number of bits served is almost completely
determined by the current channel conditions. This agrees with intuition that we should make aggressive, almost
completely channel dependent (and deadline independent) decisions when the deadline is far away, while we should
make more conservative (more deadline dependent, less channel dependent) decisions near the deadline (small $t$).
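Using $10 \log_{10} 2 \approx 3$ we can rewrite the policy in dB units as:

$$b_t(\beta_t, g_t) \approx \left\langle \frac{1}{t}\beta_t + \frac{t-1}{t} \cdot \frac{g_t^{\mathrm{dB}} - \eta_t^{\mathrm{dB}}}{3} \right\rangle_0^{\beta_t}, \qquad (17)$$

where $g_t^{\mathrm{dB}} = 10\log_{10} g_t$ and $\eta_t^{\mathrm{dB}} = 10\log_{10} \eta_t$. For large t, approximately one bit is allocated for every 3 dB by which the channel exceeds the threshold.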
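
A minimal Python sketch of this common framework is given below: a fraction 1/t of the remaining bits plus a channel-dependent correction, truncated to the feasible range. The function name `allocate` and the numerical inputs are hypothetical and only illustrate the behavior described above.

```python
import math

def allocate(beta: float, g: float, t: int, eta: float) -> float:
    """Bits to serve with beta bits remaining, channel g, t slots left and threshold eta."""
    raw = beta / t + (t - 1) / t * math.log2(g / eta)
    return min(max(raw, 0.0), beta)

at_threshold = allocate(10.0, 1.0, 5, 1.0)    # channel equals threshold: beta / t
above = allocate(10.0, 2.0, 5, 1.0)           # 3 dB above threshold: extra (t-1)/t bits
far_below = allocate(10.0, 0.01, 5, 1.0)      # very poor channel: nothing is sent
deadline = allocate(10.0, 0.01, 1, 1.0)       # last slot: everything is sent

# Far from the deadline, a channel 3 dB above the threshold earns about one extra bit.
three_db = 10 ** 0.3
far_from_deadline = allocate(5.0, three_db, 100, 1.0)
```

At t = 1 the weight (t - 1)/t vanishes, so the same expression automatically enforces the hard deadline without a special case.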

2) Channel Thresholds: The difference between the two policies is in the threshold values, which are illustrated
in Fig. 2 for a particular channel distribution. The Suboptimal I scheduler has a constant threshold $\eta_t^{(I)} = 1/\nu_1$ for
all $t$, whereas Suboptimal II has a threshold that increases with $t$ (by Proposition 1). It is intuitive to use a larger
threshold when the deadline is far away (large $t$), as the scheduler can be more selective because many different
channels remain to be seen before the deadline is reached.

By using a constant threshold, Suboptimal I is not selective enough and transmits too many bits when the deadline
is far away. To see this, consider the average number of bits transmitted in slot $t$ (ignoring truncation):

$$\mathbb{E}_{g_t}\left[b_t(\beta_t, g_t)\right] = \frac{1}{t}\beta_t + \frac{t-1}{t}\,\mathbb{E}_{g_t}\left[\log_2\frac{g_t}{\eta_t}\right]. \qquad (18)$$

Because $\eta_t^{(I)} = 1/\nu_1 = 1/\mathbb{E}[1/g]$, by Jensen's inequality $\mathbb{E}\left[\log_2 (g_t / \eta_t^{(I)})\right] = \mathbb{E}[\log_2 g_t] + \log_2 \mathbb{E}[1/g] > 0$. Thus,
Suboptimal I transmits more than $B/T$ bits on average when scheduling begins, which is in some sense overly
aggressive. On the other hand, the quantity $\mathbb{E}_{g_t}\left[\log_2 (g_t / \eta_t^{(II)})\right]$ decreases as $t$ increases and the limit is given by

$$\lim_{t \to \infty} \mathbb{E}_{g_t}\left[\log_2\frac{g_t}{\eta_t^{(II)}}\right] = 0 \qquad (19)$$

because of Proposition 1. This implies that the Suboptimal II scheduler allocates $B/T$ bits on the average when
the deadline is far away and thus, unlike Suboptimal I, is not biased or overly aggressive. Numerical results given
later support the fact that Suboptimal II generally performs better than Suboptimal I.

TABLE I: Waterfilling analogy.

| Scheduler | At each t, perform inverse-waterfilling over the following channels |
|---|---|
| Equal-bit scheduler | $g_t, \underbrace{g_t, \ldots, g_t}_{t-1}$ |
| Suboptimal I scheduler | $g_t, \underbrace{1/\nu_1, \ldots, 1/\nu_1}_{t-1}$ |
| Suboptimal II scheduler | $g_t, 1/\nu_{t-1}, \ldots, 1/\nu_1$ |
| Non-causal IWF | $g_t, g_{t-1}, g_{t-2}, \ldots, g_1$ |

D. Equal-bit Scheduler 

For comparison purposes, we consider one of the simplest causal schedulers: the equal-bit scheduler. This policy
allocates $B/T$ bits in each time slot, regardless of channel conditions, i.e.,

$$b_t^{(E)}(\beta_t, g_t) = \frac{B}{T} = \frac{1}{t}\beta_t. \qquad (20)$$

The corresponding expected energy is given by

$$\bar{J}_T^{(E)}(B) = \sum_{t=1}^{T} \mathbb{E}\left[\frac{2^{B/T} - 1}{g_t}\right] = T \, \nu_1 \left(2^{B/T} - 1\right). \qquad (21)$$

Although equal-power scheduling is asymptotically optimal for the dual problem of maximizing rate over T
slots when given a finite energy budget in the high power regime [11], it will be seen that equal-bit scheduling is
suboptimal even when B is large.
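
The schedulers of this section can be compared with a short Monte Carlo experiment. The Python sketch below is only an illustration under stated assumptions: the channel is a truncated exponential, drawn as `gamma0` plus an exponential variable with rate `lam`; the fractional moments are taken to be $\nu_m = (\mathbb{E}[g^{-1/m}])^m$ and are estimated from samples rather than computed analytically; and the names `simulate` and `clip`, the default parameters, and the sample sizes are all hypothetical. It requires T of at least 2.

```python
import math
import random

def clip(x, lo, hi):
    return min(max(x, lo), hi)

def simulate(T=5, B=10.0, lam=1.0, gamma0=0.001, trials=5000, seed=0):
    """Average total energy of the equal-bit, Suboptimal I and Suboptimal II schedulers."""
    rng = random.Random(seed)

    def draw():
        # Truncated exponential: an exponential with rate lam shifted to start at gamma0.
        return gamma0 + rng.expovariate(lam)

    # Sample estimates of nu_m for m = 1..T-1 (assumed definition of the fractional moments).
    sample = [draw() for _ in range(50000)]
    nu = [(sum(g ** (-1.0 / m) for g in sample) / len(sample)) ** m
          for m in range(1, T)]
    eta_I = 1.0 / nu[0]
    # eta_II[t] = 1 / geometric mean of nu_1..nu_{t-1}; the value at t = 1 is never used.
    eta_II = {1: 1.0}
    for t in range(2, T + 1):
        eta_II[t] = math.exp(-sum(math.log(v) for v in nu[:t - 1]) / (t - 1))

    policies = {
        "equal": lambda beta, g, t: beta / t,
        "I": lambda beta, g, t: clip(
            beta / t + (t - 1) / t * math.log2(g / eta_I), 0.0, beta),
        "II": lambda beta, g, t: clip(
            beta / t + (t - 1) / t * math.log2(g / eta_II[t]), 0.0, beta),
    }
    totals = dict.fromkeys(policies, 0.0)
    for _ in range(trials):
        # The same channel realizations are used for every policy.
        channels = {t: draw() for t in range(T, 0, -1)}
        for name, policy in policies.items():
            beta = B
            for t in range(T, 0, -1):
                b = policy(beta, channels[t], t)
                totals[name] += (2.0 ** b - 1.0) / channels[t]
                beta -= b
    return {name: total / trials for name, total in totals.items()}

averages = simulate()
```

Reusing the same channel realizations across policies reduces the variance of the comparison; the estimates remain noisy because 1/g has a heavy tail when `gamma0` is small.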

E. Inverse Waterfilling Interpretation

If Suboptimal I and II and the equal-bit schedulers are compared to the optimal non-causal policy (inverse
waterfilling), one can see that each of the algorithms mimics inverse waterfilling using either the current channel
or channel statistics for the future channels, as summarized in Table I.

V. Analysis & Numerical Results 

In this section, we compare the performance of the optimal, Suboptimal I and II, and equal-bit schedulers. For 
T = 2 we are able to quantify the advantage of optimal scheduling relative to equal bit scheduling in two extreme 
cases, while for T > 2 we can only consider numerical results. 



A. Asymptotic Analysis for T = 2 

From the optimal scheduling expression for T = 2 given in (6), we can see that the packet is split over both
time slots (i.e., $0 < b_2 < B$) if and only if $2^{-B}/\nu_1 < g_2 < 2^{B}/\nu_1$. As $B \to 0$, the probability of this event goes
to zero: if $g_2 < 1/\nu_1$ then all bits are deferred to the final slot, while if $g_2 > 1/\nu_1$ all bits are served at $t = 2$. As
a result, the expected energy cost takes on a rather simple form as $B \to 0$ (the derivation is provided in Appendix
C):

$$\bar{J}_2^{\mathrm{opt}}(B) \doteq \left(2^{B} - 1\right) \mathbb{E}\left[\frac{1}{\max(g_2, 1/\nu_1)}\right], \qquad (22)$$

where $\doteq$ represents equivalence in the limit (i.e., the ratio between both sides converges to 1 as $B \to 0$). This implies
that the corresponding effective channel is $\max(g_2, 1/\nu_1)$. On the other hand, when $B \to \infty$ the probability of only
utilizing one slot goes to zero and the limiting expected cost can be derived. The following theorem quantifies the
power advantage of optimal scheduling:

TABLE II: Average energy offsets for T = 2 (equal-bit vs. optimal causal, $\bar{J}_2^{(E)}(B) / \bar{J}_2^{\mathrm{opt}}(B)$)

| Distribution of channel state g | $B \to 0$ | $B \to \infty$ |
|---|---|---|
| Truncated exponential with $\gamma_0 = 0.1$ | 1.96 dB | 0.44 dB |
| Truncated exponential with $\gamma_0 = 0.01$ | 3.26 dB | 1.04 dB |
| Truncated exponential with $\gamma_0 = 0.001$ | 4.32 dB | 1.68 dB |
| 1x2 Rayleigh fading ($g \sim \chi_4^2$) | 1.99 dB | 0.52 dB |
| 1x3 Rayleigh fading ($g \sim \chi_6^2$) | 1.37 dB | 0.27 dB |
| 1x4 Rayleigh fading ($g \sim \chi_8^2$) | 1.10 dB | 0.18 dB |

Fig. 3: Average total energy consumption for T = 2 and average energy offset when $g$ is a truncated exponential
variable with threshold $\gamma_0 = 0.001$. (a) Average total energy versus B (bits). (b) Energy advantage of optimal
relative to equal-bit scheduling (difference in dB) versus B (bits).

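Theorem 1: The energy savings of optimal scheduling with respect to equal-bit scheduling in the extremes of $B \to 0$
and $B \to \infty$ are given by:

$$\lim_{B \to 0} \frac{\bar{J}_2^{(E)}(B)}{\bar{J}_2^{\mathrm{opt}}(B)} = \frac{\nu_1}{\mathbb{E}\left[\min(1/g, \nu_1)\right]}, \qquad \lim_{B \to \infty} \frac{\bar{J}_2^{(E)}(B)}{\bar{J}_2^{\mathrm{opt}}(B)} = \sqrt{\frac{\nu_1}{\nu_2}}. \qquad (23)$$

Proof: See Appendix C. ■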
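
The limiting energy savings for T = 2 can be checked numerically against the Rayleigh rows of Table II. The Python sketch below is an illustration under assumptions that are not stated in the text: for 1xN Rayleigh fading the channel gain is taken to be Gamma distributed with integer shape k = N and unit scale (the scale does not affect the ratios), $\nu_2$ is taken to be $(\mathbb{E}[g^{-1/2}])^2$, and the two limits are taken to be $\nu_1 / \mathbb{E}[\min(1/g, \nu_1)]$ for small B and $\sqrt{\nu_1 / \nu_2}$ for large B, which follow from the T = 2 policy and the equal-bit cost. The function name `savings_db` is hypothetical.

```python
import math

def savings_db(k: int):
    """Energy saving (in dB) of optimal over equal-bit scheduling for T = 2,
    with g ~ Gamma(k, 1) and integer k >= 2. Returns (small-B limit, large-B limit)."""
    nu1 = 1.0 / (k - 1)                                    # E[1/g]
    nu2 = (math.gamma(k - 0.5) / math.gamma(k)) ** 2       # (E[g^(-1/2)])^2
    x = 1.0 / nu1                                          # channel threshold 1/nu1
    # P(g <= x) for integer shape k.
    cdf = 1.0 - math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(k))
    # E[(1/g) 1{g > x}] for integer shape k.
    tail = math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(k - 1)) / (k - 1)
    small_B = nu1 / (nu1 * cdf + tail)                     # nu1 / E[min(1/g, nu1)]
    large_B = math.sqrt(nu1 / nu2)
    return 10 * math.log10(small_B), 10 * math.log10(large_B)

results = {k: savings_db(k) for k in (2, 3, 4)}
```

Under these assumptions the computed values agree with the tabulated 1x2, 1x3 and 1x4 Rayleigh entries to two decimal places, and they show the saving shrinking as the diversity order grows.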

Table II summarizes typical values of the energy savings (at the extremes of $B \to 0$ and $B \to \infty$) for several
fading distributions, as given by Theorem 1. As intuitively expected, the energy advantage is larger for more severe
fading distributions. In other words, optimal scheduling is more beneficial in more severe fading environments.

Figure 3 contains a plot of expected energy versus B for the optimal and equal-bit schedulers as well as a plot of
the energy difference between the two schedulers as a function of B, for channel state $g$ distributed as a truncated
exponential with the threshold $\gamma_0 = 0.001$. The energy advantage is seen to decrease from its $B \to 0$ value of
4.32 dB to the large-B asymptote of 1.68 dB.

Fig. 4: Average total energy consumption for T = 5 and T = 50



B. Numerical Results for T > 2 

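Throughout the simulations, we assume that the channel state $g_t$ is a truncated exponential with parameter $\lambda$
and threshold $\gamma_0 = 0.001$. The fractional moments of this truncated exponential variable can be calculated as:

$$\nu_m = \begin{cases} \lambda e^{\lambda\gamma_0} \, \mathrm{Ei}(\lambda\gamma_0), & m = 1, \\ \lambda \left[ e^{\lambda\gamma_0} \, \Gamma\left(\frac{m-1}{m}, \lambda\gamma_0\right) \right]^{m}, & m > 1, \end{cases} \qquad (24)$$

where $\mathrm{Ei}(\cdot)$ and $\Gamma(\cdot, \cdot)$ denote the exponential integral (here $\mathrm{Ei}(z) = \int_z^{\infty} u^{-1} e^{-u} \, du$) and the incomplete gamma
function, respectively, and its limit is given by $\nu_\infty = \frac{1}{\gamma_0} e^{-e^{\lambda\gamma_0} \mathrm{Ei}(\lambda\gamma_0)}$.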

Figures 4(a) and 4(b) compare the energy consumption of the four different algorithms (equal-bit, Suboptimal I and
II, optimal causal) for T = 5 and T = 50, in which the optimal scheduler is calculated by numerical methods.
The x-axis denotes the total number of bits B transmitted in T time slots, and thus B/T can be thought of as the
average bits per channel use. The y-axis denotes the average total energy cost $\bar{J}_T^{(E)}$, $\bar{J}_T^{(I)}$, $\bar{J}_T^{(II)}$, and $\bar{J}_T^{\mathrm{opt}}$.

From Fig. 4(a), we see that both Suboptimal I and II perform nearly as well as the optimal scheduler, although
Suboptimal II performs better than I. There are significant differences between the equal-bit and optimal schedulers,
which is to be expected given the time diversity available over the five time slots. In Fig. 4(b), we see even larger
differences between equal-bit and optimal causal, which can be explained by the even larger degree of time diversity
(T = 50). Furthermore, Suboptimal II significantly outperforms Suboptimal I for T = 50 due to the over-aggressive
nature of Suboptimal I. Suboptimal II performs nearly as well as the optimal scheduler when B is approximately
50 or larger (i.e., B/T > 1), but is sub-optimal for smaller values of B.

Figure 5 shows the expected bit allocation $\mathbb{E}[b_t]$ for the different algorithms for T = 10 slots when B is large
(B = 50, upper) and small (B = 2, lower). While the optimal causal scheduling policy allocates roughly an equal
number of bits (averaged across different realizations, and not for each particular realization) to each of the slots,
Suboptimal I is immediately seen to allocate too many bits (on average) to early time slots, which agrees with
our earlier claim that this algorithm is often overly aggressive, as explained in Section IV-C2. For B = 50 the bit
allocation of Suboptimal II is very similar to that of the optimal policy. However, for B = 2 Suboptimal II is also
overly aggressive as compared to the optimal. We suspect that the performance of Suboptimal II could be further
improved by performing some heuristic modifications to the algorithm, but this is beyond the scope of the paper
and is left to future work.

To summarize, the numerical results indicate that (a) Suboptimal II is nearly optimal for moderate to large values 
of B, (b) Suboptimal II outperforms Suboptimal I, and (c) neither suboptimal algorithm is near optimal for small 
values of B. In the next section, we will consider a policy that performs close to the optimal when B is small.

Fig. 5: Bit allocation profiles for T = 10 when B = 50 (upper) and B = 2 (lower)



VI. One-shot Allocation 

In some settings it may be undesirable to split the packet across multiple time slots, e.g., because there is a large
overhead associated with each slot used for transmission. In this scenario we may wish to find only one time slot
among the T slots for the transmission of B bits; i.e., the action $b_t$ can be either 0 or B.

The dynamic program in this setting can be written as

$$J_1(B) = \frac{2^B - 1}{g_1}, \tag{25}$$

$$J_t(B) = \min\left\{ \frac{2^B - 1}{g_t},\; E[J_{t-1}(B)] \right\}, \tag{26}$$


which is precisely an optimal stopping problem [12]. Thus, a threshold policy is optimal: allocate all B bits at the
first slot t such that $g_t > 1/\omega_t$, where $1/\omega_t$ is the threshold. That is,

$$b_t = \begin{cases} B, & t = \max\{s : g_s > 1/\omega_s\}, \\ 0, & \text{elsewhere.} \end{cases} \tag{27}$$
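As an illustration of the threshold rule in (27), the following minimal sketch applies the policy to one channel realization. The function name `one_shot_allocate`, the list/dictionary layout, and the numerical threshold values in the usage example are hypothetical; slots are indexed as in the text, counting down from t = T to t = 1.

```python
import math

def one_shot_allocate(gains, omega, B):
    """Apply the one-shot threshold policy to a single channel realization.

    gains[i] is the channel state in slot t = T - i (slots count down to 1).
    omega[t] is the threshold parameter of slot t; omega[1] = inf forces
    transmission in the last slot because 1 / inf == 0 < g.
    Returns the list of bits sent in each slot, in transmission order.
    """
    T = len(gains)
    bits = [0.0] * T
    for i, g in enumerate(gains):
        t = T - i
        if g > 1.0 / omega[t]:   # first slot whose channel exceeds its threshold
            bits[i] = B
            break
    return bits

def one_shot_energy(gains, bits):
    """Energy (2^b - 1) / g summed over slots."""
    return sum((2.0 ** b - 1.0) / g for g, b in zip(gains, bits))

# Hypothetical thresholds for T = 3, chosen only for illustration
example_omega = {1: math.inf, 2: 2.0, 3: 1.5}
```

For instance, `one_shot_allocate([0.5, 0.4, 0.01], example_omega, 2.0)` defers to the final slot because neither of the first two channel states exceeds its threshold.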



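At t = 1 the packet must be served and thus $\omega_1$ is infinite. Because the expected cost-to-go decreases as t increases,
$\omega_t$ also decreases with t; equivalently, the channel threshold $1/\omega_t$ is higher when more slots remain. In Appendix D we
show that the thresholds are given by the following recursive formula:

$$\omega_t = \begin{cases} \infty, & t = 1, \\ E\left[\frac{1}{g}\right], & t = 2, \\ E\left[\frac{1}{g} \,\Big|\, \frac{1}{g} < \omega_{t-1}\right] \Pr\left\{\frac{1}{g} < \omega_{t-1}\right\} + \omega_{t-1} \Pr\left\{\frac{1}{g} > \omega_{t-1}\right\}, & t = 3, \ldots, T. \end{cases} \tag{28}$$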



Notice that the threshold $1/\omega_t$ depends only on the channel statistics and does not depend on B.
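The recursion can be evaluated numerically. The sketch below is one possible implementation, assuming the truncated exponential channel $g = \gamma_0 + \mathrm{Exp}(\lambda)$ and using the identity that the conditional-expectation form of the recursion equals $E[\min(1/g, \omega_{t-1})]$. The helper names `expect` and `thresholds` are hypothetical.

```python
import math

def expect(h, lam=1.0, gamma0=0.001, n=20000):
    """E[h(g)] for g = gamma0 + Exp(lam), by Simpson's rule in s = ln(g / gamma0)."""
    s_max = math.log((gamma0 + 50.0 / lam) / gamma0)  # ignored tail mass is about e^-50
    step = s_max / n                                  # n must be even
    total = 0.0
    for i in range(n + 1):
        g = gamma0 * math.exp(i * step)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * h(g) * lam * math.exp(-lam * (g - gamma0)) * g
    return total * step / 3.0

def thresholds(T, lam=1.0, gamma0=0.001):
    """Return {t: omega_t} for t = 1..T.

    Uses omega_t = E[min(1/g, omega_{t-1})], which equals
    E[1/g | 1/g < w] Pr{1/g < w} + w Pr{1/g > w} with w = omega_{t-1}.
    With omega_1 = inf this also yields omega_2 = E[1/g].
    """
    omega = {1: math.inf}
    for t in range(2, T + 1):
        prev = omega[t - 1]
        omega[t] = expect(lambda g: min(1.0 / g, prev), lam, gamma0)
    return omega

def expected_one_shot_energy(B, T, lam=1.0, gamma0=0.001):
    """Expected energy with T slots available: (2^B - 1) * omega_{T+1}."""
    return (2.0 ** B - 1.0) * thresholds(T + 1, lam, gamma0)[T + 1]
```

Because B enters only through the factor $2^B - 1$, the thresholds can be computed once per channel distribution and reused for any packet size.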

Figure 6 illustrates the thresholds for the truncated exponential g (with $\lambda = 1$ and $\gamma_0 = 0.001$) and the chi-squared
g (with 4 degrees of freedom). Figure 7 illustrates the energy usage (normalized by T) of the optimal one-shot
allocation policy and the multiple slot policies. The B/T = 0.1 and B/T = 1 curves illustrate performance for
relatively small and large values of B, respectively. When B is small, the energy of the one-shot allocation is nearly
the same as the optimal policy that allows for multiple slots to be used. However, this one-shot allocation is not
appropriate when B is relatively large because the required energy of the one-shot policy grows exponentially with
B.

Fig. 7: Performance of the optimal one-shot allocation compared with multi-slot allocation algorithms



VII. Conclusion 

In this paper we considered the problem of bit/energy allocation for transmission of a finite number of bits over a 
finite delay horizon, assuming perfect instantaneous channel state information is available to the transmitter and that 
the energy and rate are related by the Shannon-type (exponential) function. We derived the optimal scheduling policy 
when the deadline spans two time slots, and derived two near-optimal policies for general deadlines. The proposed 
schedulers have a simple and intuitive form that gives insight into the optimal balance between channel-awareness 
(i.e., opportunism) and deadline-awareness in a delay-limited setting. We also considered the same problem under 
the additional constraint that only one of the available time slots can be used, and in this case found the
optimal threshold-based policy. Based upon the policy constructions and the numerical results, we observed that the
Suboptimal II scheduler is near-optimal for large/moderate values of B while the one-shot policy is near-optimal
for small values of B.

Given the increasing volume of delay-limited traffic over packet-switched wireless networks (e.g., VoIP or 
multimedia transmission in 3G systems), we expect problems of this sort to become increasingly important. Of 
course, the problem considered here represents only a particular instance of the rich space of delay-limited scheduling 
problems. Interesting extensions include consideration of discrete code rates, peak power constraints, and multi-user 
issues, and we hope this work provides useful insight for some of these other formulations. 

Appendix A 
Non-causal Scheduling 

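If the channel states are known non-causally, i.e., $g_T, \ldots, g_1$ are known at t = T, the optimal scheduling/allocation
is determined by waterfilling because each time slot serves as a parallel channel. While conventional waterfilling
maximizes rate subject to a power constraint, this is the dual of minimizing power/energy subject to a rate/bit
constraint and is referred to as inverse-waterfilling (IWF):

$$J_T\left(B, \{g_t\}_{t=1}^T\right) = \min \sum_{t=1}^T \frac{2^{b_t} - 1}{g_t}, \tag{29}$$

subject to $\sum_{t=1}^T b_t = B$ and $b_t \ge 0$. This is a convex optimization problem and can be easily solved using the
standard Lagrangian method:

$$b_t^{\mathrm{IWF}} = \left[\log_2\left(\frac{g_t}{g_{\mathrm{th}}}\right)\right]^+, \tag{30}$$

where $[x]^+ = \max(x, 0)$ and $g_{\mathrm{th}}$ is the solution to $\sum_{t=1}^T \left[\log_2\left(\frac{g_t}{g_{\mathrm{th}}}\right)\right]^+ = B$. A time slot t is called utilized if a positive number of bits is scheduled
at t, i.e., $b_t > 0$ or equivalently $g_t > g_{\mathrm{th}}$. With algebraic manipulations, we can express this IWF policy in (30)
sequentially like other causal scheduling policies as

$$b_t^{\mathrm{IWF}}(\beta_t, g_t) = \frac{1}{t'}\,\beta_t + \frac{t'-1}{t'} \log_2 \frac{g_t}{\eta_t^{\mathrm{IWF}}}, \quad \text{if } g_t > g_{\mathrm{th}}. \tag{31}$$

Otherwise $b_t^{\mathrm{IWF}}(\beta_t, g_t) = 0$, where $t' = \sum_{s=1}^{t} 1_{\{g_s > g_{\mathrm{th}}\}}$ and $\eta_t^{\mathrm{IWF}} = \left(\prod_{s=1}^{t-1} g_s^{1_{\{g_s > g_{\mathrm{th}}\}}}\right)^{1/(t'-1)}$. Notice that $g_{t-1}, \ldots, g_1$
are relatively future quantities at slot t.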

Like causal scheduling, the bit allocation process is described in two stages: first the remaining bits are divided 
equally amongst the active slots and then bits are added/subtracted depending on the channel state. 
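The inverse-waterfilling solution and its sequential form can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, B is assumed positive, and `gains` is listed in transmission order so that later list entries play the role of the "future" slots.

```python
import math

def iwf_threshold(gains, B):
    """Return g_th such that sum_t max(0, log2(g_t / g_th)) == B (B > 0)."""
    logs = sorted((math.log2(g) for g in gains), reverse=True)
    total, log_gth = 0.0, None
    for k, lg in enumerate(logs, start=1):
        total += lg
        candidate = (total - B) / k   # threshold if exactly the k best slots are used
        if lg > candidate:            # the k-th best slot still receives positive bits
            log_gth = candidate
        else:
            break
    return 2.0 ** log_gth

def iwf_bits(gains, B):
    """Direct form: b_t = [log2(g_t / g_th)]^+."""
    g_th = iwf_threshold(gains, B)
    return [max(0.0, math.log2(g / g_th)) for g in gains]

def iwf_sequential(gains, B):
    """Sequential form: equal split of remaining bits plus a channel-dependent term."""
    g_th = iwf_threshold(gains, B)
    remaining, bits = B, []
    for i, g in enumerate(gains):
        if g <= g_th:                 # slot not utilized
            bits.append(0.0)
            continue
        future = [h for h in gains[i + 1:] if h > g_th]
        t_used = 1 + len(future)      # utilized slots from now on (t' in the text)
        if future:
            eta = math.exp(sum(math.log(h) for h in future) / len(future))
            b = remaining / t_used + (t_used - 1) / t_used * math.log2(g / eta)
        else:
            b = remaining             # last utilized slot sends whatever is left
        bits.append(b)
        remaining -= b
    return bits

def energy(gains, bits):
    """Total energy sum_t (2^b_t - 1) / g_t."""
    return sum((2.0 ** b - 1.0) / g for g, b in zip(gains, bits))
```

With `gains = [0.5, 2.0, 0.1, 1.0]` and `B = 4.0`, both forms leave the weakest slot unused and spend considerably less energy than splitting the bits equally.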

Appendix B 
Channel Characterization by Fractional Moments 

We characterize the statistics of the channel states by using the fractional moments of the inverse of the channel
states g. We define the following quantity for m = 1, 2, . . .,

$$\nu_m = \left(E\left[\left(\frac{1}{g}\right)^{1/m}\right]\right)^m. \tag{32}$$

Then, the properties of these quantities are summarized as follows:

Proposition 1: The channel statistics defined according to (32) for a non-degenerate positive random variable
(this excludes the delta-type, i.e., point-mass, density) have the following properties:

(a) the sequence $\{\nu_m\}$ is strictly decreasing and the limit exists (denote the limit as $\nu_\infty$),

(b) the sequence $\{(\nu_m \nu_{m-1} \cdots \nu_1)^{1/m}\}$ is also strictly decreasing and its limit is also $\nu_\infty$.

Proof:

(a) First, we show the sequence $\{\nu_m\}$ is monotonically decreasing. Let Y = 1/g and $f_Y(y)$ be the pdf of Y. By
Hölder's inequality [16],

$$E\left[Y^{\frac{1}{m+1}}\right] = \int_0^\infty y^{\frac{1}{m+1}} f_Y(y)\,dy = \int_0^\infty \left(y^{\frac{1}{m}} f_Y(y)\right)^{\frac{m}{m+1}} \left(f_Y(y)\right)^{\frac{1}{m+1}} dy < \left(\int_0^\infty y^{\frac{1}{m}} f_Y(y)\,dy\right)^{\frac{m}{m+1}} \left(\int_0^\infty f_Y(y)\,dy\right)^{\frac{1}{m+1}} = \left(E\left[Y^{\frac{1}{m}}\right]\right)^{\frac{m}{m+1}}.$$

The strict inequality is due to the fact that Y is not a point-mass density. Raising both sides to the power
(m + 1) gives $\nu_{m+1} < \nu_m$.

Second, we show convergence of the sequence. Let $\phi_m(y) = y^{1/m}$ for y > 0 and $\psi(y) = 1 + y$ for y > 0.
Then, it is clear that $\lim_{m\to\infty} \phi_m(y) = 1$ for all y > 0, and $0 < \phi_m(y) < \psi(y)$ for all y > 0. Additionally,
$\int_0^\infty \psi(y) f_Y(y)\,dy < \infty$. By the dominated convergence theorem [16], we have

$$\lim_{m\to\infty} E\left[Y^{1/m}\right] = \lim_{m\to\infty} \int_0^\infty \phi_m(y) f_Y(y)\,dy = \int_0^\infty 1 \cdot f_Y(y)\,dy = 1.$$

Let x be a positive real number. By the continuity of the logarithmic function, we have $\lim_{x\to 0} \ln E[Y^x] = 0$.
By L'Hôpital's rule,

$$\lim_{x\to 0} \frac{\ln E[Y^x]}{x} = \lim_{x\to 0} \frac{E[Y^x \ln Y]}{E[Y^x]} = E[\ln Y],$$

since $\lim_{x\to 0} E[Y^x] = 1$ and $\lim_{x\to 0} E[Y^x \ln Y] = E[\ln Y]$ (due to the dominated convergence theorem). By
the continuity of the exponential function, $\lim_{x\to 0} e^{\frac{1}{x}\ln E[Y^x]} = e^{E[\ln Y]}$. Since the above limit exists for real x,
it also holds along the subsequence x = 1/m, which gives $\nu_\infty = e^{E[\ln Y]}$.
(b) The monotonicity of the sequence $\{(\nu_m \nu_{m-1} \cdots \nu_1)^{1/m}\}$ follows immediately from the monotonicity of the
sequence $\{\nu_m\}$ and its positivity.



By the property of the exponential function, we have

$$(\nu_m \nu_{m-1} \cdots \nu_1)^{1/m} = e^{\frac{1}{m} \ln(\nu_m \nu_{m-1} \cdots \nu_1)} = e^{\frac{1}{m} \sum_{n=1}^{m} \ln \nu_n}.$$

Since $\lim_{m\to\infty} \nu_m = \nu_\infty$ and log is continuous, $\lim_{m\to\infty} \ln \nu_m = \ln \nu_\infty$. By the Cesàro mean,

$$\lim_{m\to\infty} \frac{1}{m} \sum_{n=1}^{m} \ln \nu_n = \ln \nu_\infty.$$

From the continuity of the exponential function, we have the result.

■
Notice that $\nu_1$ and $\nu_\infty$ represent the arithmetic mean and the geometric mean of the random variable 1/g, respectively.
All other values in the sequence $\{\nu_m\}$ lie between these two means.
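Proposition 1 can be illustrated on a small example. The sketch below uses a hypothetical three-point distribution for Y = 1/g (chosen only for illustration; the proposition is stated for a density, but the same quantities are well defined here) and evaluates $\nu_m$, its limit, and the running geometric mean from part (b).

```python
import math

# Hypothetical non-degenerate distribution of Y = 1/g
VALUES = [0.5, 1.0, 4.0]
PROBS = [0.2, 0.5, 0.3]

def nu(m):
    """nu_m = (E[Y^(1/m)])^m for the discrete example."""
    return sum(p * y ** (1.0 / m) for y, p in zip(VALUES, PROBS)) ** m

def running_geo_mean(m):
    """(nu_m * nu_{m-1} * ... * nu_1)^(1/m), computed in the log domain."""
    return math.exp(sum(math.log(nu(n)) for n in range(1, m + 1)) / m)

# nu_1 is the arithmetic mean of Y; the limit nu_inf is its geometric mean
arithmetic_mean = sum(p * y for y, p in zip(VALUES, PROBS))
geometric_mean = math.exp(sum(p * math.log(y) for y, p in zip(VALUES, PROBS)))

if __name__ == "__main__":
    for m in (1, 2, 5, 20, 100):
        print(m, round(nu(m), 4), round(running_geo_mean(m), 4))
    print("geometric mean:", round(geometric_mean, 4))
```

Both sequences decrease toward the geometric mean, with the running geometric mean converging more slowly because it averages over all earlier terms.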



Appendix C 
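Proof of Theorem 1

For simple derivation, we work in units of nats rather than bits. From the optimal two-slot policy, the energy cost can be derived as

$$J_T(g_2, B) = \begin{cases} (e^B - 1)\,\nu_1, & g_2 \le \frac{e^{-B}}{\nu_1}, \\ 2e^{B/2} \left(\frac{1}{g_2}\nu_1\right)^{1/2} - \frac{1}{g_2} - \nu_1, & \frac{e^{-B}}{\nu_1} < g_2 < \frac{e^{B}}{\nu_1}, \\ \frac{e^B - 1}{g_2}, & g_2 \ge \frac{e^{B}}{\nu_1}. \end{cases} \tag{33}$$

Thus,

$$J_T(B) = E_{g_2}\left[J_T(g_2, B)\right] = \int_0^{\frac{e^{-B}}{\nu_1}} (e^B - 1)\,\nu_1\,dF(x) + \int_{\frac{e^{-B}}{\nu_1}}^{\frac{e^{B}}{\nu_1}} \left(2e^{B/2} \left(\frac{1}{x}\nu_1\right)^{1/2} - \frac{1}{x} - \nu_1\right) dF(x) + \int_{\frac{e^{B}}{\nu_1}}^{\infty} \frac{e^B - 1}{x}\,dF(x), \tag{34}$$

where F is the cumulative distribution function (CDF) of the channel state g.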
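By the limit rules,

$$\lim_{B\to 0} \frac{\int_0^{\frac{e^{-B}}{\nu_1}} (e^B - 1)\,\nu_1\,dF(x) + \int_{\frac{e^{B}}{\nu_1}}^{\infty} \frac{e^B - 1}{x}\,dF(x)}{\int_0^\infty (e^B - 1) \min\left(\frac{1}{x}, \nu_1\right) dF(x)} = 1 \tag{35}$$

and

$$\lim_{B\to 0} \frac{\int_{\frac{e^{-B}}{\nu_1}}^{\frac{e^{B}}{\nu_1}} \left(2e^{B/2} \left(\frac{1}{x}\nu_1\right)^{1/2} - \frac{1}{x} - \nu_1\right) dF(x)}{(e^B - 1)\,E\left[\min\left(\frac{1}{g}, \nu_1\right)\right]} = 0. \tag{36}$$

With (34), we obtain (23). Likewise,

$$\lim_{B\to\infty} \frac{\int_{\frac{e^{-B}}{\nu_1}}^{\frac{e^{B}}{\nu_1}} \left(2e^{B/2} \left(\frac{1}{x}\nu_1\right)^{1/2} - \frac{1}{x} - \nu_1\right) dF(x)}{2e^{B/2} (\nu_2 \nu_1)^{1/2}} = \lim_{B\to\infty} \frac{2e^{B/2} (\nu_2 \nu_1)^{1/2} - 2\nu_1}{2e^{B/2} (\nu_2 \nu_1)^{1/2}} = 1 \tag{37}$$

and

$$\lim_{B\to\infty} \frac{\int_0^{\frac{e^{-B}}{\nu_1}} (e^B - 1)\,\nu_1\,dF(x) + \int_{\frac{e^{B}}{\nu_1}}^{\infty} \frac{e^B - 1}{x}\,dF(x)}{2e^{B/2} (\nu_2 \nu_1)^{1/2}} = 0. \tag{38}$$

Thus, we have shown (24).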
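The piecewise two-slot cost used in this appendix can be checked against a brute-force search over the first-slot allocation. The sketch below works in nats and treats $\nu_1 = E[1/g_1]$ as a given number; the stationary point used in `optimal_b2` comes from setting the derivative of the two-slot cost to zero and is an assumption that is consistent with the three regions of the piecewise cost. All function names are hypothetical.

```python
import math

def two_slot_cost(b2, g2, B, nu1):
    """Energy now plus expected energy in the last slot (nats): b2 now, B - b2 later."""
    return (math.exp(b2) - 1.0) / g2 + (math.exp(B - b2) - 1.0) * nu1

def optimal_b2(g2, B, nu1):
    """Stationary point of two_slot_cost in b2, clipped to the feasible range [0, B]."""
    b = 0.5 * B + 0.5 * math.log(g2 * nu1)
    return min(max(b, 0.0), B)

def closed_form_cost(g2, B, nu1):
    """Piecewise closed-form cost of the optimal two-slot allocation."""
    if g2 <= math.exp(-B) / nu1:          # channel too poor: defer everything
        return (math.exp(B) - 1.0) * nu1
    if g2 >= math.exp(B) / nu1:           # channel very good: send everything now
        return (math.exp(B) - 1.0) / g2
    return 2.0 * math.exp(B / 2.0) * math.sqrt(nu1 / g2) - 1.0 / g2 - nu1

def brute_force_cost(g2, B, nu1, grid=20000):
    """Minimum of two_slot_cost over a uniform grid of b2 values in [0, B]."""
    return min(two_slot_cost(B * k / grid, g2, B, nu1) for k in range(grid + 1))
```

For example, with `B = 2.0` and `nu1 = 1.5`, the values `g2 = 0.01`, `g2 = 1.0`, and `g2 = 100.0` fall in the three different regions, and the closed form matches the brute-force minimum in each.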



Appendix D 
Derivation of (28)

From (26) the threshold $\omega_t$ is related to the expected cost-to-go by $\omega_t = \frac{1}{2^B - 1} E[J_{t-1}(B)]$. The one-step cost-to-go
is $E[J_1(B)] = (2^B - 1)\,E\left[\frac{1}{g}\right]$ and therefore $\omega_2 = E\left[\frac{1}{g}\right]$. For t > 2, we expand the cost-to-go in terms of
$\omega_{t-1}$ to give:

$$\omega_t = \frac{1}{2^B - 1} E[J_{t-1}(B)] = \frac{1}{2^B - 1} \left( E\left[\frac{2^B - 1}{g_{t-1}} \,\Big|\, \frac{1}{g_{t-1}} < \omega_{t-1}\right] \Pr\left\{\frac{1}{g_{t-1}} < \omega_{t-1}\right\} + E[J_{t-2}(B)] \Pr\left\{\frac{1}{g_{t-1}} > \omega_{t-1}\right\} \right).$$

By substituting $E[J_{t-2}(B)] = (2^B - 1)\,\omega_{t-1}$, we have the result.